Repeated exposure to either consistently spatiotemporally congruent or consistently incongruent audiovisual stimuli modulates the audiovisual common-cause prior

To estimate an environmental property such as object location from multiple sensory signals, the brain must infer their causal relationship. Only information originating from the same source should be integrated. This inference relies on the characteristics of the measurements, the information the sensory modalities provide on a given trial, as well as on a cross-modal common-cause prior: accumulated knowledge about the probability that cross-modal measurements originate from the same source. We examined the plasticity of this cross-modal common-cause prior. In a learning phase, participants were exposed to a series of audiovisual stimuli that were either consistently spatiotemporally congruent or consistently incongruent; participants’ audiovisual spatial integration was measured before and after this exposure. We fitted several Bayesian causal-inference models to the data; the models differed in the plasticity of the common-source prior. Model comparison revealed that, for the majority of the participants, the common-cause prior changed during the learning phase. Our findings reveal that short periods of exposure to audiovisual stimuli with a consistent causal relationship can modify the common-cause prior. In accordance with previous studies, both exposure conditions could either strengthen or weaken the common-cause prior at the participant level. Simulations imply that the direction of the prior-update might be mediated by the degree of sensory noise, the variability of the measurements of the same signal across trials, during the learning phase.


S1 Results for the bimodal spatial-discrimination task
In the first preparatory experiment, we measured participants' modality-specific biases in auditory relative to visual spatial perception. Specifically, participants determined whether an auditory test stimulus, the location of which varied within a wide range, was to the left or right of a visual standard stimulus, the location of which was predetermined (±4° or ±12°). We fitted four cumulative Gaussian distributions to the binary judgments as a function of auditory stimulus location, one for each visual stimulus location, with a common lapse rate, which was constrained to be less than 6% [1]. For each psychometric function, we calculated the PSE, and fitted a linear regression line to those four PSEs as a function of visual stimulus location (Fig. 2B, right panel). The estimated slopes for all participants were significantly smaller than 1 (mean = 0.593°, SEM = 0.042°, Fig. S1). Thus, the auditory stimuli were perceived as shifted towards the periphery relative to the visual stimuli, which is in line with previous findings [2][3][4][5]. Ten out of 17 participants (outliers excluded; see S2 Excluding outlier participants) showed significant positive intercepts (mean = 2.106°, SEM = 0.469°, Fig. S1), indicating that they perceived the auditory stimuli as shifted to the left relative to visual stimuli.

Figure S1. Results for the bimodal spatial-discrimination task. The slope (x-axis) and the intercept (y-axis) of the linear regression for PSEs on visual stimulus location. Grey circles: individual data; yellow circle: group mean; grey error bars: 95% bootstrapped confidence intervals; black error bars: standard error of the mean (SEM). Vertical and horizontal dashed lines correspond to the absence of proportional and constant perceptual biases of audition relative to vision, respectively.
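The regression of PSEs on visual stimulus location can be sketched as follows. This is a minimal illustration in Python, not the analysis code used in the study: the `pse` values are hypothetical numbers chosen for illustration, and `numpy.polyfit` stands in for whatever regression routine was actually used.

```python
import numpy as np

# Visual standard locations (deg), as described in the text.
visual_loc = np.array([-12.0, -4.0, 4.0, 12.0])

# HYPOTHETICAL PSEs (deg), one per visual standard; illustrative values only,
# not data from the study.
pse = np.array([-5.0, -0.5, 4.5, 9.0])

# Least-squares line: PSE = slope * visual_loc + intercept.
slope, intercept = np.polyfit(visual_loc, pse, deg=1)

# A slope of 1 would indicate no proportional bias and an intercept of 0
# no constant bias of audition relative to vision.
print(f"slope = {slope:.4f}, intercept = {intercept:.4f} deg")
```

In this toy example the slope is below 1 and the intercept is positive, which is the qualitative pattern described above.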

S2 Excluding outlier participants
In the unimodal spatial-localization task (preparatory experiment 3), participants localized either unimodal visual stimuli, presented at either ±12° or ±4°, or unimodal auditory stimuli, the locations of which were identified individually for each participant in the bimodal spatial-discrimination task (preparatory experiment 1). The localization responses from this experiment enabled us to measure modality-specific biases directly, and therefore gave us the opportunity to re-evaluate whether the participant-specific auditory stimulus locations truly matched the four visual stimulus locations. To this end, we computed the mean auditory and visual localization responses (Fig. S2A), and fitted two linear regressions, one for the mean auditory responses and the other for the mean visual responses, on visual stimulus locations (Fig. S2B). The slopes of the two linear regressions, which reflect the proportional bias in spatial perception, should match if the participant-specific auditory stimulus locations were perceived as co-located with the four pre-selected visual stimulus locations; the same holds true for the intercepts, which reflect the constant bias in spatial perception. Therefore, we computed the mean absolute value of the difference between the slopes (M = 0.313°, SD = 0.246°) and between the intercepts (M = 2.149°, SD = 1.641°). We excluded participants who showed a difference of the slopes or the intercepts greater than 3 SDs or smaller than −3 SDs. Three participants met these criteria and thus were excluded from further analysis and model fitting (Fig. S2C). With these outliers excluded, the mean absolute value of the difference between slopes was reduced (M = 0.243°, SD = 0.169°), as was the difference between intercepts (M = 1.965°, SD = 1.053°).

Table S1. Linear mixed-effects model (LMM) analysis for the auditory ventriloquism effects from the pre- and post-learning phases of the main experiment. Test statistics reveal whether the estimated β is significantly different from 0 (a greater absolute t-value indicates that β is further away from 0) and the results of model comparison between the null and alternative models (a larger χ² indicates stronger evidence for the alternative models).

Table S3. Generalized linear mixed-effects model (GLMM) analysis for the binary unity judgments from the pre- and post-learning phases of the main experiment. Test statistics reveal whether the estimated β is significantly different from 0 (a greater absolute z-value indicates that β is further away from 0) and the results of model comparison between the null and alternative models (a larger χ² indicates stronger evidence for the alternative models).

S3.3 Unity judgments
S4 Individual-level empirical data and model predictions

[Figure S4 panels. Columns: congruent and incongruent conditions; legend: empirical data and model predictions; y-axis: the proportion of reporting a common cause.]
Figure S4. Empirical data (dots) and model predictions (lines) for all the participants from the second group in Figure 5B.

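S5.1 An alternative decision strategy for implicit causal inference: model selection
We considered model selection as an alternative decision strategy for deriving the final auditory and visual location estimates [6]. According to model selection, the final location estimate $\hat{s}_{AV_l,A}$ is the location estimate conditioned on a common cause, $\hat{s}_{AV_l,A,C=1}$, if the posterior probability of a common cause is greater than that of separate causes. Otherwise, the final location estimate $\hat{s}_{AV_l,A}$ equals the location estimate conditioned on separate causes, $\hat{s}_{AV_l,A,C=2}$, i.e.,

$$\hat{s}_{AV_l,A} = \begin{cases} \hat{s}_{AV_l,A,C=1} & \text{if } P(C=1 \mid m_{AV_l,A}, m_{AV_l,V}) > P(C=2 \mid m_{AV_l,A}, m_{AV_l,V}), \\ \hat{s}_{AV_l,A,C=2} & \text{otherwise.} \end{cases}$$

Analogously,

$$\hat{s}_{AV_l,V} = \begin{cases} \hat{s}_{AV_l,V,C=1} & \text{if } P(C=1 \mid m_{AV_l,A}, m_{AV_l,V}) > P(C=2 \mid m_{AV_l,A}, m_{AV_l,V}), \\ \hat{s}_{AV_l,V,C=2} & \text{otherwise.} \end{cases}$$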
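The model-selection rule can be illustrated with a short Python sketch. It assumes the standard Gaussian causal-inference likelihoods (Gaussian measurement noise and a Gaussian supra-modal prior), which may differ in detail from the expressions in the main text; the function names and all parameter values are hypothetical.

```python
import numpy as np

def posterior_common_cause(m_a, m_v, sig_a, sig_v, mu_p, sig_p, p_common):
    """P(C=1 | m_a, m_v) under standard Gaussian assumptions (a sketch)."""
    va, vv, vp = sig_a**2, sig_v**2, sig_p**2
    denom = va * vv + va * vp + vv * vp
    quad = ((m_a - m_v)**2 * vp + (m_a - mu_p)**2 * vv
            + (m_v - mu_p)**2 * va) / denom
    like_c1 = np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(denom))
    like_c2 = (np.exp(-0.5 * ((m_a - mu_p)**2 / (va + vp)
                              + (m_v - mu_p)**2 / (vv + vp)))
               / (2 * np.pi * np.sqrt((va + vp) * (vv + vp))))
    return p_common * like_c1 / (p_common * like_c1 + (1 - p_common) * like_c2)

def auditory_estimate_model_selection(m_a, m_v, sig_a, sig_v, mu_p, sig_p,
                                      p_common):
    """Pick the estimate of the causal structure with the higher posterior."""
    post = posterior_common_cause(m_a, m_v, sig_a, sig_v, mu_p, sig_p, p_common)
    w_a, w_v, w_p = 1 / sig_a**2, 1 / sig_v**2, 1 / sig_p**2
    # Estimate if both measurements share a source (C = 1).
    s_c1 = (w_a * m_a + w_v * m_v + w_p * mu_p) / (w_a + w_v + w_p)
    # Auditory estimate if the sources are separate (C = 2).
    s_c2 = (w_a * m_a + w_p * mu_p) / (w_a + w_p)
    return np.where(post > 0.5, s_c1, s_c2)

# Hypothetical parameter values, for illustration only.
near = auditory_estimate_model_selection(1.0, 0.0, 5.0, 1.0, 0.0, 100.0, 0.5)
far = auditory_estimate_model_selection(30.0, -30.0, 5.0, 1.0, 0.0, 100.0, 0.5)
```

With nearby measurements the auditory estimate is pulled towards the visual measurement; with widely separated measurements it stays close to the auditory measurement. Model averaging would instead weight `s_c1` and `s_c2` by `post` and `1 - post`.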

S5.2.1 Optimal posterior-based unity judgments
The first alternative decision strategy we considered is that the observer decides whether a co-occurring auditory and visual stimulus originate from a common source based on the posterior probability of a common cause, $P(C = 1 \mid m_{AV_l,A}, m_{AV_l,V})$. More specifically, if this probability is greater than 0.5, then the observer intends to report 'common cause'. In other words, $I_{C=1}$ is what the observer intends to report. However, as described in the main text, we assume that the observer lapses occasionally during the task with a lapse rate $\lambda_{unity}$. Therefore, even when the observer intends to report 'common cause', he or she will still mistakenly report 'separate causes' with a probability of $\lambda_{unity}$, which results in reporting 'common cause' with a probability of $1 - \lambda_{unity}$. In other words,

$$P(I_{C=1} = 1 \mid m_{AV_l,A}, m_{AV_l,V}) = \begin{cases} 1 - \lambda_{unity} & \text{if } P(C = 1 \mid m_{AV_l,A}, m_{AV_l,V}) > 0.5, \\ \lambda_{unity} & \text{otherwise.} \end{cases}$$

The lapse rate is modeled in the same way for the following alternative decision strategy.

S5.2.2 Measurement-based (heuristic) unity judgments
We also considered a heuristic strategy as an alternative, which is based directly on the difference between the auditory and visual measurements. If the difference between them is below ε, then the observer intends to report 'common cause'. That is,

$$P(I_{C=1} = 1 \mid m_{AV_l,A}, m_{AV_l,V}) = \begin{cases} 1 - \lambda_{unity} & \text{if } |m_{AV_l,A} - m_{AV_l,V}| < \varepsilon, \\ \lambda_{unity} & \text{otherwise.} \end{cases}$$

S5.3 Overview of models with different decision strategies and assumptions about the plasticity of the common-cause prior
As described in the main text, we tested three models with different assumptions about the plasticity of the common-cause prior (high-plasticity, short-lasting changes; high-plasticity, long-lasting changes; and no-plasticity). Each of these three models was tested five times with different assumptions about the decision strategy for deriving the final location estimates and for determining the unity judgments. Specifically, we considered model averaging paired with estimate-based (heuristic), posterior-based, or measurement-based (heuristic) unity judgments. We also considered model selection paired with either posterior-based or measurement-based (heuristic) unity judgments, but not with estimate-based (heuristic) unity judgments, because combining model selection with estimate-based unity judgments makes the model logically circular: the estimates are based on the unity judgment, which, in turn, is based on the estimates. Therefore, we tested a total of five decision strategies for each of the three models for the common-cause prior, resulting in a total of 15 models (Table S4).

Table S4. Summary of the total number of free parameters for all the tested models. Among models with the same assumption for the plasticity of the common-cause prior, those assuming estimate-based and measurement-based unity judgments have one more free parameter (ε) than those assuming posterior-based unity judgments.

S5.4 Model likelihoods
For a given model $M$, we fitted the model jointly to the forced-choice data from the bimodal spatial-discrimination task, $X_1$, the localization responses from the unimodal spatial-localization task, $X_2$, and the localization responses and unity judgments from the bimodal spatial-localization task during the pre- and post-learning phases of both congruent and incongruent conditions, $X_3$. The localization responses from the localization practice task were fitted separately to reduce the number of free parameters estimated simultaneously. The data from the learning phase were excluded as they are sequentially dependent, making them computationally challenging to fit. All models were fit using a maximum-likelihood procedure. That is, a set of free parameters $\Theta$ was chosen to maximize the log likelihood

$$\log \mathcal{L}(M, \Theta \mid X_1, X_2, X_3) = \log \mathcal{L}(M, \Theta_1 \mid X_1) + \log \mathcal{L}(M, \Theta_2 \mid X_2) + \log \mathcal{L}(M, \Theta_3 \mid X_3).$$

The candidate models only varied with respect to $\Theta_3$.

S5.4.1 Model likelihoods given the bimodal spatial-discrimination task
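For each trial $t$, participants were presented with an auditory test stimulus at location $s_{A,o(t)}$ and a visual standard stimulus at location $s_{V,w(t)}$, in a random order. Participants were then asked to indicate whether the auditory stimulus was located to the left, $I_{A\text{-right},w(t),o(t)} = 0$, or to the right, $I_{A\text{-right},w(t),o(t)} = 1$, compared to the location of the visual stimulus. For each such trial, the likelihood of a model $M$ and a candidate parameter set $\Theta_1$ given the response $I_{A\text{-right},w(t),o(t)}$ is

$$\mathcal{L}(M, \Theta_1 \mid I_{A\text{-right},w(t),o(t)}) = p_{I_{A\text{-right},w(t),o(t)}=1}^{\,I_{A\text{-right},w(t),o(t)}} \left(1 - p_{I_{A\text{-right},w(t),o(t)}=1}\right)^{1 - I_{A\text{-right},w(t),o(t)}},$$

where $p_{I_{A\text{-right},w(t),o(t)}=1}$ is defined in Eq. 15. Thus, the log likelihood given the responses across all $T_1$ trials is

$$\log \mathcal{L}(M, \Theta_1 \mid X_1) = \sum_{t=1}^{T_1} \log \mathcal{L}(M, \Theta_1 \mid I_{A\text{-right},w(t),o(t)}).$$

The log likelihood depends on the bias parameters $a_A$ and $b_A$, the parameters of the supra-modal prior over locations $\mu_P$ and $\sigma_P$, the measurement noise $\sigma_A$ and $\sigma_V$, as well as the lapse rate $\lambda_{AV}$. The final set of free parameters that were constrained by the forced-choice data in this task was

$$\Theta_1 = \{a_A, b_A, \mu_P, \sigma_P, \sigma_A, \sigma_V, \lambda_{AV}\}.$$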
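The log likelihood of binary left/right responses can be sketched as below. This is a minimal illustration in Python; the function name `bernoulli_log_likelihood` and the example numbers are hypothetical, and the clipping constant is a numerical safeguard added for this sketch rather than something taken from the source.

```python
import numpy as np

def bernoulli_log_likelihood(responses, p_right):
    """Sum of per-trial log likelihoods for binary responses.

    responses: 1 if the auditory stimulus was reported to the right, else 0.
    p_right:   model-predicted probability of a 'right' report on each trial.
    """
    responses = np.asarray(responses, dtype=float)
    # Clip to avoid log(0); the bound is an assumption of this sketch.
    p = np.clip(np.asarray(p_right, dtype=float), 1e-12, 1 - 1e-12)
    return float(np.sum(responses * np.log(p)
                        + (1 - responses) * np.log1p(-p)))

# Hypothetical trials, for illustration only.
log_lik = bernoulli_log_likelihood([1, 0, 1], [0.8, 0.3, 0.6])
```

A maximum-likelihood fit would search for the parameter set that maximizes this quantity (or minimizes its negative) jointly with the log likelihoods from the other tasks.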

S5.4.2 Model likelihoods given unimodal localization responses
For each trial $t$, participants were asked to localize either a unimodal visual or auditory stimulus, resulting in a cursor location setting $r_{A,u(t)}$ or $r_{V,w(t)}$. The localization responses from this task were modeled as Gaussian distributions. From these distributions, we can compute the likelihood of a model $M$ and a candidate parameter set $\Theta_2$ as the Gaussian probability density function in Eq. 13, evaluated at the observed localization response $r_{A,u(t)}$ or $r_{V,w(t)}$, which gives $P(r_{A,u(t)} \mid M, \Theta_2)$ and $P(r_{V,w(t)} \mid M, \Theta_2)$. The log-likelihood is the sum of the log-likelihoods across trials:

$$\log \mathcal{L}(M, \Theta_2 \mid X_2) = \sum_{t_A} \log P(r_{A,u(t_A)} \mid M, \Theta_2) + \sum_{t_V} \log P(r_{V,w(t_V)} \mid M, \Theta_2),$$

where $t_A$ and $t_V$ index auditory and visual trials, respectively. The log-likelihood depends on $\mu_{\hat{s}_{A,u(t_A)}}$ and $\mu_{\hat{s}_{V,w(t_V)}}$, which in turn depend on the bias parameters $a_A$ and $b_A$, the parameters of the supra-modal prior over locations $\mu_P$ and $\sigma_P$, the standard deviations of the measurement distributions $\sigma_A$ and $\sigma_V$, as well as the perception-unrelated response noise $\sigma_r$, which was estimated from the data from the localization-practice task. Therefore, the set of parameters constrained by the localization responses in this task is

$$\Theta_2 = \{a_A, b_A, \mu_P, \sigma_P, \sigma_A, \sigma_V\}.$$

S5.4.3 Model likelihoods given bimodal spatial-localization responses and unity judgments
The main experiment consisted of two conditions ($i = 1$: congruent; $i = 2$: incongruent) and two phases ($j = 1$: pre-learning; $j = 2$: post-learning). For each trial $t$ of session $(i, j)$, participants were presented with an audiovisual stimulus pair $AV_l$ ($l$ indexes the audiovisual pair, $l \in \{1, 2, ..., 16\}$). After stimulus presentation, participants first localized a stimulus of a cued modality, which resulted in a cursor location setting $r_{i,j,AV_{l(t)},A}$ or $r_{i,j,AV_{l(t)},V}$. Participants then made a unity judgment, which resulted in a binary response $I_{C=1,i,j,AV_{l(t)}}$ (1: same source; 0: different sources). The overall log-likelihood of $M$ and $\Theta_M$ (where we designate the version of $\Theta_3$ used for model $M$ as $\Theta_M$) is the log-likelihood summed over all localization responses and unity judgments:

$$\begin{aligned} \log \mathcal{L}(M, \Theta_M \mid X_3) = \sum_{i=1}^{2} \sum_{j=1}^{2} \Bigg[ & \sum_{t_A} \log \iint P(r_{i,j,AV_{l(t)},A} \mid m_{AV_l,A}, m_{AV_l,V})\, P(I_{C=1,i,j,AV_{l(t)}} \mid m_{AV_l,A}, m_{AV_l,V}) \\ & \qquad \times P(m_{AV_l,A} \mid s_{i,j,AV_l})\, P(m_{AV_l,V} \mid s_{i,j,AV_l})\, dm_{AV_l,A}\, dm_{AV_l,V} \\ + & \sum_{t_V} \log \iint P(r_{i,j,AV_{l(t)},V} \mid m_{AV_l,A}, m_{AV_l,V})\, P(I_{C=1,i,j,AV_{l(t)}} \mid m_{AV_l,A}, m_{AV_l,V}) \\ & \qquad \times P(m_{AV_l,A} \mid s_{i,j,AV_l})\, P(m_{AV_l,V} \mid s_{i,j,AV_l})\, dm_{AV_l,A}\, dm_{AV_l,V} \Bigg], \end{aligned} \tag{S11}$$

where $t_A$ and $t_V$ index the trials on which the auditory or the visual stimulus was cued for localization. The log-likelihood depends on the final location estimates $\hat{s}_{AV_l,A}$ and $\hat{s}_{AV_l,V}$, which in turn depend on the bias parameters $a_A$ and $b_A$, the parameters of the supra-modal prior over locations $\mu_P$ and $\sigma_P$, the standard deviations of the measurement distributions under bimodal presentation $\sigma_{AV,A}$ and $\sigma_{AV,V}$, as well as two lapse rates $\lambda_{unity}$ (one for each condition), the set of common-cause priors $\{p_{C=1}\}$ that are model-specific, and the perception-unrelated response noise $\sigma_r$, which was estimated using the data from the localization-practice task. Additionally, for models with estimate-based or measurement-based unity judgments, the log-likelihood depends on ε, the internal criterion for determining the unity judgment. Therefore, the set of parameters constrained by the localization responses and unity judgments in this task is

$$\Theta_M = \{a_A, b_A, \mu_P, \sigma_P, \sigma_{AV,A}, \sigma_{AV,V}, \vec{\lambda}_{unity}, \vec{p}_{C=1}, \varepsilon\}.$$

S5.4.4 Model likelihoods of models that assume different decision strategies
As described in the previous section, the likelihood of models that assume model averaging as the decision strategy for deriving final location estimates is the joint likelihood of a pair of auditory and visual sensory measurements given the localization response registered by the cursor and the unity judgment, $P(r_{AV_l}, I_{C=1,AV_l} \mid m_{AV_l,A}, m_{AV_l,V})$, multiplied by the joint probability of the two measurements, $P(m_{AV_l,A}, m_{AV_l,V} \mid s_{AV_l})$, and integrated over the two variables inaccessible to the experimenter: the visual and auditory sensory measurements. Since the models assume model averaging, the localization response $r_{AV_l}$ and the unity judgment $I_{C=1,AV_l}$ are conditionally independent regardless of which decision strategy the unity judgment is based on. Thus, $P(r_{AV_l}, I_{C=1,AV_l} \mid m_{AV_l,A}, m_{AV_l,V})$ can be rewritten as the product of the likelihood of a pair of auditory and visual sensory measurements given the localization response, $P(r_{AV_l} \mid m_{AV_l,A}, m_{AV_l,V})$, and the likelihood given the unity judgment, $P(I_{C=1,AV_l} \mid m_{AV_l,A}, m_{AV_l,V})$. However, this conditional independence does not apply to models that assume model selection.

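Based on model selection, the final location estimate depends on the unity judgment. Therefore, Eq. S11 is modified as

$$\begin{aligned} \log \mathcal{L}(M, \Theta_M \mid X_3) = \sum_{i=1}^{2} \sum_{j=1}^{2} \Bigg[ & \sum_{t_A} \log \iint P(r_{i,j,AV_{l(t)},A} \mid I_{C=1,i,j,AV_{l(t)}}, m_{AV_l,A}, m_{AV_l,V})\, P(I_{C=1,i,j,AV_{l(t)}} \mid m_{AV_l,A}, m_{AV_l,V}) \\ & \qquad \times P(m_{AV_l,A} \mid s_{i,j,AV_l})\, P(m_{AV_l,V} \mid s_{i,j,AV_l})\, dm_{AV_l,A}\, dm_{AV_l,V} \\ + & \sum_{t_V} \log \iint P(r_{i,j,AV_{l(t)},V} \mid I_{C=1,i,j,AV_{l(t)}}, m_{AV_l,A}, m_{AV_l,V})\, P(I_{C=1,i,j,AV_{l(t)}} \mid m_{AV_l,A}, m_{AV_l,V}) \\ & \qquad \times P(m_{AV_l,A} \mid s_{i,j,AV_l})\, P(m_{AV_l,V} \mid s_{i,j,AV_l})\, dm_{AV_l,A}\, dm_{AV_l,V} \Bigg], \end{aligned}$$

where $P(r_{i,j,AV_{l(t)},A} \mid I_{C=1,i,j,AV_{l(t)}}, m_{AV_l,A}, m_{AV_l,V})$ and, analogously, $P(r_{i,j,AV_{l(t)},V} \mid I_{C=1,i,j,AV_{l(t)}}, m_{AV_l,A}, m_{AV_l,V})$ are given by Eqs. S13 and S14.

S6 Mathematical derivations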

S6.1 The center and the spread of the distribution for localization responses
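A presentation of an auditory stimulus at location $s_{A,u}$, where $u$ indexes the participant-specific auditory stimulus location, results in an internal measurement $m_{A,u}$ with variance $\sigma_A^2$. The estimate of the remapped location of this stimulus is the average of the internal measurement and the mean of the spatial prior $\mu_P$, each weighted by its relative reliability:

$$\hat{s}_{A,u} = \frac{m_{A,u}/\sigma_A^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_P^2}.$$

Analogously, the estimate of the remapped location of a visual stimulus is

$$\hat{s}_{V,w} = \frac{m_{V,w}/\sigma_V^2 + \mu_P/\sigma_P^2}{1/\sigma_V^2 + 1/\sigma_P^2},$$

where $w$ indexes the visual stimulus location ($s_V \in \{-12°, -4°, 4°, 12°\}$). Given that the measurement distributions are Gaussian and the family of Gaussian distributions is closed under linear transformations, the probability distribution of the location estimates of a test stimulus is

$$\hat{s}_{A,u} \sim \mathcal{N}\left(\mu_{\hat{s}_{A,u}}, \sigma^2_{\hat{s}_A}\right),$$

where

$$\mu_{\hat{s}_{A,u}} = \frac{\mathbb{E}[m_{A,u}]/\sigma_A^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_P^2}, \qquad \sigma^2_{\hat{s}_A} = \frac{1/\sigma_A^2}{\left(1/\sigma_A^2 + 1/\sigma_P^2\right)^2},$$

and $\mathbb{E}[m_{A,u}]$ denotes the mean of the auditory measurement distribution. Analogously, for the location estimates of the visual standard stimulus,

$$\hat{s}_{V,w} \sim \mathcal{N}\left(\mu_{\hat{s}_{V,w}}, \sigma^2_{\hat{s}_V}\right),$$

where

$$\mu_{\hat{s}_{V,w}} = \frac{\mathbb{E}[m_{V,w}]/\sigma_V^2 + \mu_P/\sigma_P^2}{1/\sigma_V^2 + 1/\sigma_P^2}, \qquad \sigma^2_{\hat{s}_V} = \frac{1/\sigma_V^2}{\left(1/\sigma_V^2 + 1/\sigma_P^2\right)^2}.$$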
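The reliability-weighted estimate and the resulting Gaussian distribution of estimates can be checked numerically. The following Python sketch uses hypothetical parameter values and function names; it compares the closed-form mean and standard deviation that follow from the linear transformation with a Monte Carlo simulation.

```python
import numpy as np

def estimate_from_measurement(m, sigma, mu_p, sigma_p):
    """Reliability-weighted average of a measurement and the prior mean."""
    w = (1 / sigma**2) / (1 / sigma**2 + 1 / sigma_p**2)
    return w * m + (1 - w) * mu_p

def estimate_distribution(mean_m, sigma, mu_p, sigma_p):
    """Mean and SD of the estimate when m ~ N(mean_m, sigma^2)."""
    w = (1 / sigma**2) / (1 / sigma**2 + 1 / sigma_p**2)
    # A linear transformation of a Gaussian scales its SD by the weight w.
    return w * mean_m + (1 - w) * mu_p, w * sigma

# Hypothetical values, for illustration only.
mean_m, sigma, mu_p, sigma_p = 10.0, 3.0, 0.0, 4.0
mu_hat, sd_hat = estimate_distribution(mean_m, sigma, mu_p, sigma_p)

# Monte Carlo check: simulate measurements and transform each one.
rng = np.random.default_rng(0)
samples = estimate_from_measurement(
    rng.normal(mean_m, sigma, size=200_000), sigma, mu_p, sigma_p)
```

The simulated mean and standard deviation of `samples` should agree closely with `mu_hat` and `sd_hat`.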

S6.2 The psychometric functions for the bimodal spatial-discrimination task
In this task, visual stimuli were presented at four different locations $s_{V,w}$, where $w$ indexes the visual stimulus location ($s_V \in \{-12°, -4°, 4°, 12°\}$). In each trial, $s_{V,w}$ was paired with an auditory stimulus at one of the test locations $s_{A,o}$, where $o$ indexes the finer grid of auditory locations. For each pair, the model predicts $p_{w,o}$, the probability of estimating the auditory test stimulus $s_{A,o}$ to be located to the right of the visual standard stimulus at location $s_{V,w}$. We assume that the observer makes the decision by comparing the internal location estimates $\hat{s}_{A,o}$ and $\hat{s}_{V,w}$, that is, the observer judges the auditory stimulus to be to the right if

$$\hat{s}_{A,o} - \hat{s}_{V,w} > 0.$$

Given Eqs. S15-S16, the probability distribution of the difference between the two location estimates $\hat{s}_{A,o}$ and $\hat{s}_{V,w}$ can be written as

$$\hat{s}_{A,o} - \hat{s}_{V,w} \sim \mathcal{N}\left(\mu_{\hat{s}_{A,o}} - \mu_{\hat{s}_{V,w}},\ \sigma^2_{\hat{s}_A} + \sigma^2_{\hat{s}_V}\right).$$

Taken together, the probability of perceiving an auditory test stimulus presented at location $s_{A,o}$ to the right of a visual standard stimulus presented at location $s_{V,w}$ is

$$p_{w,o} = P(\hat{s}_{A,o} - \hat{s}_{V,w} > 0) = \Phi\left(\frac{\mu_{\hat{s}_{A,o}} - \mu_{\hat{s}_{V,w}}}{\sqrt{\sigma^2_{\hat{s}_A} + \sigma^2_{\hat{s}_V}}}\right),$$

where $\Phi$ is the standard normal cumulative distribution function. We additionally assumed that participants lapsed occasionally at rate $\lambda_{AV}$. Therefore, the probability of reporting an auditory test stimulus as being to the right of a visual standard stimulus is equal to

$$p_{I_{A\text{-right},w,o}=1} = \frac{\lambda_{AV}}{2} + (1 - \lambda_{AV})\, p_{w,o}.$$

In sum, the probability of reporting the auditory stimulus as to the right of the visual stimulus as a function of the distance between the two stimulus locations is described by a cumulative Gaussian distribution that approaches $\lambda_{AV}/2$ when the auditory stimulus is located far to the left and $1 - \lambda_{AV}/2$ when it is far to the right of the visual stimulus.
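A cumulative-Gaussian psychometric function with a lapse rate, as described in this section, can be sketched in a few lines of Python. The function name and the example values are hypothetical; the inputs are the means and standard deviations of the two location-estimate distributions.

```python
import math

def p_report_right(mu_a, mu_v, sd_a, sd_v, lapse):
    """Probability of reporting the auditory stimulus to the right.

    mu_a, mu_v: means of the auditory and visual location estimates.
    sd_a, sd_v: standard deviations of the two estimate distributions.
    lapse:      lapse rate; a lapse yields a random left/right response.
    """
    z = (mu_a - mu_v) / math.sqrt(sd_a**2 + sd_v**2)
    p_right = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF
    return lapse / 2 + (1 - lapse) * p_right

# Hypothetical values: the asymptotes are lapse/2 and 1 - lapse/2.
far_left = p_report_right(-100.0, 0.0, 2.0, 1.0, 0.06)
aligned = p_report_right(0.0, 0.0, 2.0, 1.0, 0.06)
far_right = p_report_right(100.0, 0.0, 2.0, 1.0, 0.06)
```

When the two estimate distributions have the same mean, the function returns 0.5 regardless of the lapse rate, which is why the PSE is unaffected by lapses in this formulation.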

S7 Estimated parameters given the best-fitting model
S7.1 All relevant free model parameters
Table S5. Estimated parameters given the best-fitting model for each participant. Note that the elements of $\vec{p}_{C=1}$ follow the order of pre-learning, congruent post-learning, and incongruent post-learning if the best-fitting model is the high-plasticity, short-lasting changes model. The elements of $\vec{p}_{C=1}$ follow the order of congruent pre-learning, congruent post-learning, incongruent pre-learning, and incongruent post-learning if the best-fitting model is the high-plasticity, long-lasting changes model. The elements of $\vec{\lambda}_{unity}$ follow this order: congruent, incongruent.

Table S7. Estimated common-cause priors, changes in the prior, and negative log likelihood of model parameters for participants whose data were best explained by the high-plasticity, short-lasting changes model. Brackets include 95% bootstrapped confidence intervals.

Table S8. Estimated common-cause priors, changes in the prior, and negative log likelihood of model parameters for participants whose data were best explained by the high-plasticity, long-lasting changes model.

Table S9. Estimated common-cause priors and negative log likelihood of model parameters for participants whose data were best explained by the no-plasticity model.

S9.1 Predicted probability distributions for localization responses
Given the best-fitting model and model parameters, for each participant, condition ($i = 1$: congruent; $i = 2$: incongruent), phase ($j = 1$: pre-learning; $j = 2$: post-learning), and each audiovisual stimulus pair $AV_l$ ($l \in \{1, 2, ..., 16\}$), we computed the joint probability distribution $P(m_{AV_l,A}, m_{AV_l,V} \mid s_{i,j,AV_l})$ by taking the outer product of the probability of visual measurements and the probability of auditory measurements, that is,

$$P(M_A = m_{AV_l,A}, M_V = m_{AV_l,V} \mid s_{i,j,AV_l}) = P(M_A = m_{AV_l,A} \mid s_{i,j,AV_l})\, P(M_V = m_{AV_l,V} \mid s_{i,j,AV_l}),$$

where $M_A$ and $M_V$ are random variables for auditory and visual measurements. Their ranges were selected to extend 5 standard deviations of the sensory noise away from the remapped stimulus location in perceptual space, that is, from $s_{AV_l,A} - 5\sigma_{AV,A}$ to $s_{AV_l,A} + 5\sigma_{AV,A}$ for auditory measurements, and from $s_{AV_l,V} - 5\sigma_{AV,V}$ to $s_{AV_l,V} + 5\sigma_{AV,V}$ for visual measurements. Both variables have 100 evenly spaced discrete values, resulting in a 100 × 100 grid (Fig. S6A).

For each modality, we identified the leftmost and rightmost final location estimates across this grid and calculated the range, denoted as $\max(\hat{s}_{AV_l,A}) - \min(\hat{s}_{AV_l,A})$ and $\max(\hat{s}_{AV_l,V}) - \min(\hat{s}_{AV_l,V})$, respectively. We defined two new random variables, $R_A$ and $R_V$, such that

$$P(R_A = r_A \mid s_{AV_l}) = \sum_{m=1}^{100} \sum_{n=1}^{100} \phi\left(r_A;\ \hat{s}_{AV_l,A}(m, n),\ \sigma_r\right) P\left(m_{AV_l,A}(m), m_{AV_l,V}(n) \mid s_{AV_l}\right) \tag{S26}$$

and analogously,

$$P(R_V = r_V \mid s_{AV_l}) = \sum_{m=1}^{100} \sum_{n=1}^{100} \phi\left(r_V;\ \hat{s}_{AV_l,V}(m, n),\ \sigma_r\right) P\left(m_{AV_l,A}(m), m_{AV_l,V}(n) \mid s_{AV_l}\right),$$

where $\phi(r; \mu, \sigma)$ is the Gaussian probability density with mean $\mu$ and standard deviation $\sigma$ evaluated at $r$, $m$ and $n$ index the discrete values for remapped auditory and visual measurements, respectively, and $r_A$ ranges from $2\min(\hat{s}_{AV_l,A}) - \max(\hat{s}_{AV_l,A})$ to $2\max(\hat{s}_{AV_l,A}) - \min(\hat{s}_{AV_l,A})$ with an increment of 0.1. $\sigma_r$ represents the variability in responses due to loss of precision in memory or the use of the response device (Eq. 12).
In other words, the probability distribution of localization responses is the sum, across all possible combinations of auditory and visual measurements, of Gaussian distributions centered at the intended response location, with the variability being the motor/memory noise, weighted by the joint probability of the corresponding auditory and visual measurements (Fig. S6C).
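The grid-based marginalization over measurements can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: `fused_estimate` is a hypothetical placeholder for the final-location-estimate function (in the fitted models it would be the causal-inference estimate), and all parameter values are made up.

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def response_distribution(s_a, s_v, sig_a, sig_v, sig_r, estimate_fn,
                          n=100, step=0.1):
    """Discretized distribution of localization responses for one AV pair."""
    # Measurement grids spanning +/- 5 SDs around the stimulus locations.
    m_a = np.linspace(s_a - 5 * sig_a, s_a + 5 * sig_a, n)
    m_v = np.linspace(s_v - 5 * sig_v, s_v + 5 * sig_v, n)
    p_a = normal_pdf(m_a, s_a, sig_a)
    p_v = normal_pdf(m_v, s_v, sig_v)
    # Joint probability: outer product, normalized over the grid.
    joint = np.outer(p_a / p_a.sum(), p_v / p_v.sum())
    # Final location estimate for every measurement combination (n x n).
    s_hat = estimate_fn(m_a[:, None], m_v[None, :])
    lo, hi = s_hat.min(), s_hat.max()
    r = np.arange(2 * lo - hi, 2 * hi - lo + step, step)
    # Gaussian response noise around each estimate, weighted by the joint.
    kernel = normal_pdf(r[:, None, None], s_hat[None, :, :], sig_r)
    p_r = (kernel * joint[None, :, :]).sum(axis=(1, 2))
    return r, p_r / p_r.sum()  # normalize the discretized distribution

def fused_estimate(m_a, m_v, sig_a=5.0, sig_v=1.0):
    """HYPOTHETICAL placeholder: reliability-weighted fusion."""
    w_a, w_v = 1 / sig_a**2, 1 / sig_v**2
    return (w_a * m_a + w_v * m_v) / (w_a + w_v)

r, p_r = response_distribution(4.0, -4.0, 5.0, 1.0, 2.0, fused_estimate)
expected_response = float(np.sum(r * p_r))
```

The expected value of the response, `expected_response`, is the quantity that enters the model-predicted ventriloquism effect.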
Given the probability distribution of localization responses, we computed the model-predicted ventriloquism effects (VE) as follows:

$$VE_A = \mathbb{E}[R_A \mid s_{AV_l}] - \mu_{\hat{s}_A}$$

and

$$VE_V = \mathbb{E}[R_V \mid s_{AV_l}] - \mu_{\hat{s}_V},$$

where $\mu_{\hat{s}_A}$ and $\mu_{\hat{s}_V}$ represent the mean perceived auditory and visual stimulus locations when they were presented alone at location $s_{AV_l,A}$ and $s_{AV_l,V}$, respectively (Eq. 14). In other words, the model-predicted ventriloquism effect is the difference between the expected value of the localization response for a given audiovisual stimulus pair $s_{AV_l}$ and the mean perceived stimulus location when each element of $s_{AV_l}$ was presented alone. The difference was coded as positive if the expected value of the localization response shifts in the direction of the other stimulus modality.

S9.2 Predicted proportion of reporting 'common cause'
Given each combination of auditory and visual measurements, the model-predicted probability of reporting 'common cause', $P(I_{C=1} = 1 \mid m_{AV_l,A}, m_{AV_l,V})$, was computed using Eq. 11 if the best-fitting decision strategy was estimate-based, that is, based on comparing the distance between the auditory and visual final location estimates (Fig. S7A) with a participant-specific internal criterion: if the distance is smaller than the internal criterion, participants report 'common cause'; otherwise, they report 'separate causes' (Fig. S7B). Eq. S4 was used for the posterior-based decision strategy and Eq. S5 for the measurement-based decision strategy.

The probability of reporting 'common cause' given an audiovisual stimulus pair $s_{AV_l}$, $P(I_{C=1} = 1 \mid s_{AV_l})$, is computed as follows:

$$P(I_{C=1} = 1 \mid s_{AV_l}) = \sum_{m=1}^{100} \sum_{n=1}^{100} P\left(I_{C=1} = 1 \mid m_{AV_l,A}(m), m_{AV_l,V}(n)\right) P\left(m_{AV_l,A}(m), m_{AV_l,V}(n) \mid s_{AV_l}\right).$$

In other words, $P(I_{C=1} = 1 \mid s_{AV_l})$ is the sum, across all possible combinations of auditory and visual measurements, of the probability of reporting 'common cause' by taking lapse rate into account, weighted by the joint probability of the corresponding auditory and visual measurements (Fig. S7C).
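The weighted sum over the measurement grid can be sketched in Python. For brevity, this illustration uses the measurement-based (heuristic) decision rule; it assumes that a lapse flips the intended report in both directions, and the function name and parameter values are hypothetical.

```python
import numpy as np

def p_common_report(s_a, s_v, sig_a, sig_v, eps, lapse, n=100):
    """Predicted probability of reporting 'common cause' for one AV pair."""
    # Measurement grids spanning +/- 5 SDs around the stimulus locations.
    m_a = np.linspace(s_a - 5 * sig_a, s_a + 5 * sig_a, n)
    m_v = np.linspace(s_v - 5 * sig_v, s_v + 5 * sig_v, n)
    p_a = np.exp(-0.5 * ((m_a - s_a) / sig_a) ** 2)
    p_v = np.exp(-0.5 * ((m_v - s_v) / sig_v) ** 2)
    # Joint probability of the measurements, normalized over the grid.
    joint = np.outer(p_a / p_a.sum(), p_v / p_v.sum())
    # Heuristic rule: intend 'common cause' if the measurements are close.
    intends_common = np.abs(m_a[:, None] - m_v[None, :]) < eps
    # ASSUMPTION: lapses flip the intended report with probability `lapse`.
    p_report = np.where(intends_common, 1 - lapse, lapse)
    return float((p_report * joint).sum())

# Hypothetical values, for illustration only.
p_same_location = p_common_report(0.0, 0.0, 1.0, 1.0, eps=50.0, lapse=0.05)
p_far_apart = p_common_report(-12.0, 12.0, 2.0, 1.0, eps=3.0, lapse=0.05)
```

For the posterior-based strategy, `intends_common` would instead test whether the posterior probability of a common cause exceeds 0.5 at each grid point.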

S9.3 Confidence intervals for model-predicted ventriloquism effects using parametric bootstrapping
To derive 95% confidence intervals on model-predicted ventriloquism effects, we used parametric bootstrapping. Specifically, given the best-fitting model and model parameters, we first computed the probability distribution of auditory and visual localization responses for each audiovisual stimulus pair $s_{AV_l}$, phase, condition, and participant (Eqs. 9-10, S13-S14). Then, we drew ten localization responses from the probability distribution for each stimulus modality, $r_{AV_l,A}$ and $r_{AV_l,V}$ (Fig. S8A), and repeated this procedure 1000 times. Additionally, we drew 30 unimodal localization responses for an auditory stimulus presented at $s_{AV_l,A}$ and for a visual stimulus presented at $s_{AV_l,V}$, respectively, based on closed-form formulas (Eq. 13), and computed the mean unimodal localization responses across trials for each of the 1000 simulation runs (Fig. S8B). The number of trials for each simulation run was selected to match the actual number of trials used in our study.
We then calculated the auditory and visual shifts by subtracting the mean unimodal localization response (Fig. S8B) from the bimodal localization responses (Fig. S8A), and then averaged the shifts across trials for each simulation run. We repeated this procedure for every audiovisual stimulus pair ($l \in \{1, 2, ..., 16\}$), and averaged the shifts of stimulus pairs that correspond to the same spatial discrepancy (e.g., four pairs have a spatial discrepancy of 0°: $[s_V = -12°, s_{A,1}]$, $[s_V = -4°, s_{A,2}]$, $[s_V = 4°, s_{A,3}]$, and $[s_V = 12°, s_{A,4}]$; Fig. S8C, left panel). Ventriloquism effects are almost equivalent to the auditory and visual shifts, except that shifts towards the other stimulus modality were coded as positive (Fig. S8C, middle panel). Finally, we averaged the ventriloquism effects of stimulus pairs that correspond to the same absolute spatial discrepancy, sorted them independently for each absolute spatial discrepancy in ascending order, and computed the lower and upper bounds of the 95% confidence intervals, which correspond to the 2.5th and 97.5th percentiles (Fig. S8C, right panel).

Figure S9. The procedure for computing 95% confidence intervals on proportions of reporting 'common cause'. Left panel: simulated binary responses (white: a 'common cause' response; black: a 'separate causes' response) for the audiovisual stimulus pair with a spatial discrepancy of −16°; middle panel: proportions of reporting 'common cause' as a function of spatial discrepancy; right panel: averaged proportions of reporting 'common cause' for audiovisual stimulus pairs that have the same absolute spatial discrepancy. The lower and upper bounds of the 95% confidence intervals correspond to the 2.5th and 97.5th percentiles.
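The parametric bootstrap for ventriloquism effects can be sketched in Python for a single stimulus pair. This is a simplified illustration, not the authors' code: it omits the averaging across stimulus pairs with the same discrepancy, and the response distribution `p_r`, the unimodal parameters, and the function name are hypothetical.

```python
import numpy as np

def bootstrap_ve_ci(r_grid, p_r, mu_uni, sd_uni, toward_sign,
                    n_bi=10, n_uni=30, n_boot=1000, seed=0):
    """95% percentile interval for a model-predicted ventriloquism effect.

    r_grid, p_r: discretized distribution of bimodal localization responses.
    mu_uni, sd_uni: mean and SD of unimodal localization responses.
    toward_sign: +1 or -1, so that shifts towards the other modality
                 are coded as positive.
    """
    rng = np.random.default_rng(seed)
    # Simulated bimodal responses: n_bi draws per simulation run.
    bimodal = rng.choice(r_grid, size=(n_boot, n_bi), p=p_r)
    # Simulated unimodal responses: n_uni draws per simulation run.
    unimodal = rng.normal(mu_uni, sd_uni, size=(n_boot, n_uni))
    # Shift: bimodal responses minus the mean unimodal response,
    # averaged across trials within each run.
    shifts = bimodal.mean(axis=1) - unimodal.mean(axis=1)
    return np.percentile(toward_sign * shifts, [2.5, 97.5])

# HYPOTHETICAL response distribution centered 2 deg towards the other modality.
r_grid = np.arange(-10.0, 10.1, 0.1)
p_r = np.exp(-0.5 * ((r_grid - 2.0) / 3.0) ** 2)
p_r /= p_r.sum()
ci = bootstrap_ve_ci(r_grid, p_r, mu_uni=0.0, sd_uni=3.0, toward_sign=1)
```

The same percentile step applies to the proportions of reporting 'common cause', with binary draws in place of localization responses.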

S9.4 Confidence intervals for model-predicted proportions of reporting 'common cause' using parametric bootstrapping
To derive 95% confidence intervals on the proportion of reporting 'common cause', we again used parametric bootstrapping. Specifically, given the best-fitting model and model parameters, we first computed the proportion of reporting 'common cause' for each audiovisual stimulus pair $s_{AV_l}$, phase, condition, and participant (Eq. S30), and then averaged the proportions of reporting 'common cause' for audiovisual stimulus pairs that correspond to the same spatial discrepancy. Then, we drew $N$ binary judgments (Fig. S9, left panel; $I_{C=1} = 1$: common cause; $I_{C=1} = 0$: separate causes) for each level of spatial discrepancy, where $N$ represents the number of trials for each spatial discrepancy ($N$ = 20, 40, 60, 80, 60, 40, 20 for spatial discrepancies of −24°, −16°, −8°, 0°, 8°, 16°, 24°). We repeated this procedure 1000 times. We computed the proportion of reporting 'common cause' across trials for each simulation run and each spatial discrepancy (Fig. S9, middle panel). Finally, we averaged the proportions of reporting 'common cause' of stimulus pairs that correspond to the same absolute spatial discrepancy, sorted them independently for each absolute spatial discrepancy in ascending order, and computed the lower and upper bounds of the 95% confidence intervals, which correspond to the 2.5th and 97.5th percentiles (Fig. S9, right panel).

S10 Simulated posterior probabilities of a common cause and changes of the common-cause prior given varying sensory measurement noise
S10.1 The posterior probability of a common cause
Table S10. Free parameters and their values for simulations shown in Fig. 6A.
$\mu_P$: the center of a supra-modal prior over stimulus location; value: 0
$\sigma_P$: the variability of a supra-modal prior over stimulus location; value: 100

S10.2 The updates of the common-cause prior

[Figure S10 panels. Color scale: changes in the common-cause prior; vertical axis: variability of auditory location measurements; horizontal axis: variability of visual location measurements; panels for the congruent and incongruent learning conditions.]
Figure S10. Simulation results for the congruent (A) and incongruent (B) learning conditions. Color key: accumulated changes in the common-cause prior, averaged across 100 simulations of repeated exposure to congruent/incongruent audiovisual stimulus pairs, given different variabilities of auditory (vertical axis) and visual (horizontal axis) location measurements, different residual perceptual biases, and different common-cause priors the simulated participant started with at the beginning of the learning phase (top row: $p_{C=1} = 0.27$; bottom row: $p_{C=1} = 0.87$).